Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor context attribute access in data modules #1319

Merged
merged 5 commits into from
Sep 20, 2024

Conversation

DanielAvdar
Copy link
Contributor

Add a context property to ensure consistent and easier access to the context session user across dynamodb.py and sql_alchemy.py modules. This change replaces direct references to context.session with self.context.session for better encapsulation and maintainability.

@dokterbob
Copy link
Collaborator

Please make sure to merge from master @DanielAvdar! Curious what you're proposing here but you definitely want to have my latest refactor in -- and I'd love to see unit tests! ❤️

@DanielAvdar
Copy link
Contributor Author

DanielAvdar commented Sep 11, 2024

unable to isolate instance of the data management class from application context, which required for writing tests.

@DanielAvdar DanielAvdar marked this pull request as ready for review September 11, 2024 19:54
@dokterbob
Copy link
Collaborator

dokterbob commented Sep 13, 2024

unable to isolate instance of the data management class from application context, which required for writing tests.

:/

If we can't test it that's definitely something we'll have to address. Could you describe the issues you're running into a bit better, perhaps I can help you move things forward.

I'm increasingly aiming to get all PR's covered by tests.

@dokterbob
Copy link
Collaborator

dokterbob commented Sep 18, 2024

🌈 Raining mad kudo's on you @DanielAvdar, great help!

Ironically, I just implemented similar test functionality as part of #1319

As you can see there, I'm trying to strictly follow the code file structure in my tests.

With your permission, I can integrate your work into my tests, that shouldn't be such a hassle. But let me know if you'd rather work on that yourself.

I love how you actually wrote decent data models for SQLAlchemy. I'd love to get that into the actual data layer implementation, once we've tests -- but preferably after we forked all data layers out into the new repo and packages (we're working on the proposal as we speak).

Why the original author used raw SQL boggles my mind. ;)

@dokterbob
Copy link
Collaborator

@DanielAvdar Would you prefer to refactor your tests to match mine or would you prefer I do it on your branch?

@DanielAvdar
Copy link
Contributor Author

DanielAvdar commented Sep 18, 2024

@DanielAvdar Would you prefer to refactor your tests to match mine or would you prefer I do it on your branch?

My branch, go ahead

Add a context property to ensure consistent and easier access to the context session user across dynamodb.py and sql_alchemy.py modules. This change replaces direct references to `context.session` with `self.context.session` for better encapsulation and maintainability.

Signed-off-by: DanielAvdar <[email protected]>
Introduce SQLite database tests and fixtures for pytest. This includes database schema definitions, temporary file handling, and data layer context management, along with updating dependencies in pyproject.toml and poetry.lock.

Signed-off-by: DanielAvdar <[email protected]>
@dokterbob
Copy link
Collaborator

@DanielAvdar As I went further into your code while working on the tests (I need to get a good grip on the data layers, anyways), I was surprised why you use outside context anyways. It seemed a very hack'ish solution (not just in your code), it makes the code hard to use and test (because of the side effects).

As we're working towards moving data layers fully out onto a separate project and repo, I hope to get rid of this type of strong coupling. (The same goes for whatever's happening in queue_until_user_message(), although that one seems to make less 'deep' assumptions.) If performance is the reason, I suggest to optimise queries and/or add explicit (self-contained) caching of some sort (in a separate PR!).

I've made a start and got rid of it in the SQLAlchemy layer, I would love to have a look at it (maybe there's something I missed?) and like to ask to do the same for your DynamoDB implementation.

I'm (more than) a bit curious why actual content is written to files rather than kept in the DB, but I guess that's a different discussion.

In addition, I really liked that you actually did data modelling to create the data structure. I have refrained from using it just for tests, because there's raw SQL in the docs and I want to make sure that what's documented will keep working.

But now that we have tests, I would love to use data models rather than raw SQL in there (might also take care of SQLite/JSON weirdness) and, ideally, also creation of data structure on model initialisation (so app startup). All of that in later PR's, please.

Want to clean house, possibly clean up data layer API before making it all fully pluggable. Also, we really should get rid of having two different user identifiers (uuid and email, why!???).

Please have a look, looking forward to hear from you! Also on ideas of our community repo (WIP)!

Data layers should be functional and self-consistent to allow for testing and predictable behaviour.
@DanielAvdar
Copy link
Contributor Author

DanielAvdar commented Sep 19, 2024

@DanielAvdar As I went further into your code while working on the tests (I need to get a good grip on the data layers, anyways), I was surprised why you use outside context anyways. It seemed a very hack'ish solution (not just in your code), it makes the code hard to use and test (because of the side effects).

As we're working towards moving data layers fully out onto a separate project and repo, I hope to get rid of this type of strong coupling. (The same goes for whatever's happening in queue_until_user_message(), although that one seems to make less 'deep' assumptions.) If performance is the reason, I suggest to optimise queries and/or add explicit (self-contained) caching of some sort (in a separate PR!).

I've made a start and got rid of it in the SQLAlchemy layer, I would love to have a look at it (maybe there's something I missed?) and like to ask to do the same for your DynamoDB implementation.

I'm (more than) a bit curious why actual content is written to files rather than kept in the DB, but I guess that's a different discussion.

In addition, I really liked that you actually did data modelling to create the data structure. I have refrained from using it just for tests, because there's raw SQL in the docs and I want to make sure that what's documented will keep working.

But now that we have tests, I would love to use data models rather than raw SQL in there (might also take care of SQLite/JSON weirdness) and, ideally, also creation of data structure on model initialisation (so app startup). All of that in later PR's, please.

Want to clean house, possibly clean up data layer API before making it all fully pluggable. Also, we really should get rid of having two different user identifiers (uuid and email, why!???).

Please have a look, looking forward to hear from you! Also on ideas of our community repo (WIP)!

@dokterbob
Thanks for the thorough code review and valuable suggestions! I appreciate your efforts to improve the codebase.

Regarding the use of outside context, it was initially introduced as a temporary workaround to address some coupling issues. I agree that it's not ideal, and I'm fully on board with your efforts to decouple the data layers.

As for storing data in files, it's primarily intended for testing purposes and serverless applications where traditional database access might be challenging. We can definitely discuss alternative approaches if you have concerns about this.

@dokterbob dokterbob merged commit 1964409 into Chainlit:main Sep 20, 2024
16 checks passed
dokterbob pushed a commit that referenced this pull request Oct 8, 2024
dokterbob pushed a commit that referenced this pull request Oct 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants